MARX W. WARTOFSKY


RULES AND REPRESENTATION: THE VIRTUES OF 
CONSTANCY AND FIDELITY PUT IN PERSPECTIVE 


In this paper I will argue that the widely accepted theory of perceptual 
constancy, and the equally widely held account of fidelity in representa- 
tion rest on the same mistake. My argument (which derives from and 
extends Nelson Goodman’s, in Languages of Art) is that this theory of 
perceptual constancy is based on a theory of vision which is false, namely, 
the standard theory which interprets vision on the model of Euclidean 
geometrical optics. Further, I will argue that the view which takes the 
rules of linear perspective to be the norm for fidelity in pictorial represen- 
tation is based on the same false theory (or on its mirror image, that is, on 
the theory that pictorial representation’s norm for fidelity is the mirror 
image). 

My argument, in brief, is that the theory of perceptual constancy is 
required to explain perceptual invariance of such features as size, shape, 
etc., only because it construes our perception of objects and scenes in the 
three-dimensional world as if it were perception of a two-dimensional 
image of the world. I hope to show that the theory of constancy is 
redundant: that it is required only in order to correct for putative 
mistakes which the theory itself mistakenly postulates as features of our 
perception. The theory of perceptual constancy thus invents the vice with 
respect to which it is the complementary virtue. In effect, I hope to show 
that the theory cancels itself out, and is therefore redundant, once rightly 
viewed. Moreover, since the account of representational fidelity as 
defined by the rules of linear perspective is interdependent with con- 
stancy theory, this notion of fidelity also stands or falls with the theory. To 
give way to metaphor for the moment, constancy and fidelity are parallel 
virtues. Put in perspective, they will be seen to converge. Let me begin to 
put matters in perspective then, first, by considering the standard view, 
and then, by presenting my argument against it. 

The theory of perceptual constancy tells us that our perception pre- 
serves such features as size and shape invariant through changing appear- 
ances. The rules of linear perspective, in drawing, tell us how to render 
these changing appearances with fidelity so that constancy can do its work
on two-dimensional pictures just in the same way that it does on three- 
dimensional objects and scenes. Constancy and fidelity are held to be 
virtues in that they give us the rules for correct perceptual interpretation 
and correct representation, and thus permit us to see things as they really 
are, and to represent them as they really appear, without error. Con- 
stancy preserves reality, while fidelity preserves appearances. They are, 
in effect, inverse virtues. What constancy corrects, according to rule, is 
just what fidelity presents for such correction. 

The virtue of constancy, in perception, is that it corrects or regulates 
putative perceptual mistakes. Or so it is alleged. The theory of constancy 
is thus a normative theory, in the face of what, without it, would result in 
perceptual error. Thus, shape or size constancy is the perceptual system’s
way of setting right what is presented in a misleading way. Tilted circles 
‘appear’ or are presented to our visual apparatus as ellipses. Constancy 
stretches them back into shape, so that we recognize the presented 
elliptical shapes as tilted circles, as fully round as circles ought to be. 
Things in the distance ‘appear’ smaller, or project a smaller retinal image 
than do the same things closer up. Constancy preserves size through such 
phenomenal or physiological variations, and we are said to ‘infer’ varia- 
tions in distance, as the correct reading of apparent size variation. 

Fidelity is the virtue of constancy transferred from three-dimensional 
vision to two-dimensional pictorial representation. Fidelity, thus, is a 
virtue in representation which preserves in the appearances what con- 
stancy preserves in ordinary perception. If perceptual constancy is the 
perceptual system’s rule for right readings, then fidelity is assured by the 
rule of representation which delivers up the text for the right reading 
according to the perceptual rule. Or so it seems. 

This abbreviated version of the traditional view of perceptual con- 
stancy and representational fidelity raises some conceptual questions, 
which it may be useful to discuss at the outset. First, a number of brief 
clarifications and qualifications: 

(a) The whole account of perceptual constancy presupposes a distinc- 
tion between what is received by the visual apparatus, and what is 
perceived. The classical distinction between sensation and perception, 
between the ‘givens’ in perception and our perceptual ‘judgments,’ is 
crucial to the notion of constancy. What is received is ostensibly the 
bundle of light rays reflected from surfaces. The external medium (air) is
presumed to have no effect on the geometry of the projection, so that in 
linear perspective at least, the conditions of Euclidean space are pre- 
served. (In aerial perspective, where color and dark-light gradients are 
affected by distance, matters are of course different.) In the projection of 
the bundle of light rays, variations in size and shape of projected figures 
follow the transformations of Euclidean geometry, i.e. of classical
geometrical optics. Whatever refraction there is, is similarly described in 
terms of classical dioptrics, so that lenses, for example (or more generally, 
media with differential refractive indices) bend the received light rays. 
And in the case of the crystalline biconvex lens of the eye, the flux of light 
rays is converged, and at the retinal distance from the lens, forms an 
inverted image, or point-for-point projection on the retinal surface. This 
image varies in size, inversely with the distance between eye and object; 
and varies in shape with variation in the angle of incidence of the light rays 
with the plane upon which the image is projected.¹
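
The dependence of image size on distance, and of projected shape on tilt, that this model assumes can be illustrated numerically. What follows is only a minimal sketch of the standard geometrical-optics picture (the picture disputed in this paper), assuming an idealized pinhole eye. The function names, the 17 mm lens-to-retina distance, and the distant-viewer approximation used for the tilted circle are illustrative assumptions, not part of the original text.

```python
import math


def image_height(object_height, distance, focal_length=0.017):
    """Projected image height for an idealized pinhole eye.

    All lengths are in metres. focal_length is an assumed, illustrative
    lens-to-retina distance, not a measured value.
    """
    if distance <= 0:
        raise ValueError("distance must be positive")
    # Similar triangles: image height varies inversely with distance.
    return focal_length * object_height / distance


def foreshortened_axes(radius, tilt_degrees):
    """Semi-axes of the ellipse projected by a tilted circle.

    Assumes a distant viewer (orthographic approximation), so the axis
    lying in the direction of tilt shrinks by cos(tilt) and the other
    axis is unchanged.
    """
    tilt = math.radians(tilt_degrees)
    return radius, radius * abs(math.cos(tilt))


if __name__ == "__main__":
    for d in (1.0, 2.0, 4.0):
        print("distance", d, "image height", image_height(1.8, d))
    print("tilted circle axes", foreshortened_axes(1.0, 60.0))
```

On these assumptions, doubling the distance halves the projected height, and a circle tilted 60 degrees from the frontal-parallel plane projects with a minor axis half the length of its major axis.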

The distinction therefore is one between the optical image on the retina 
and the object or scene of which it is the projected image. More precisely, 
the distinction is one between certain features or properties of the image 
and of the perceived object, e.g. size and shape. The presumption that a 
corrective operation is required in visual perception is based on this 
difference: for the ‘given’, or the stimulus-information is taken to be the 
retinal image itself. Once this premise is accepted, then of course the 
preservation of the shape or size of perceived objects through variations 
in the retinal image requires some mental or psychological operation — 
what Helmholtz called ‘unbewusster Schluss,’ an unconscious inference. 
The fundamental assumption of the constancy-theory is therefore that 
our access to the three-dimensional visual world is mediated by a 
two-dimensional image, which is a projective transformation of the 
bundle of light rays, by convergence, upon a two-dimensional surface. It 
is easy to understand why this model lends itself easily to the interpreta- 
tion of perception as a mode of reflection upon an internal ‘picture’ of the 
world; and therefore, that the perception of the world is like
picture-perception, in some sense.²

But what I take here to be the error of constancy-theory (the
interpretation of the binocular three-dimensional vision of a moving and
acting subject as if it were the monocular two-dimensional vision of a
fixed subject) is an instructive error. For constancy theory, though it is a
false theory of vision, turns out on a different interpretation, to suggest an 
alternative theory, like my own. Namely, if the constancy transformation 
is required, by the theory, to preserve invariances through ‘phenomenal’ 
or physiological (retinal) variation in what is ‘given’, or presented to the 
eye; and if what is given is a two-dimensional mapping or ‘picture’ of the 
visual world, then the implicit claim is that our seeing is mediated by such 
a ‘picture.’ My own argument is just that such a ‘picture’ does mediate our 
vision, but that it is not the ‘picture’ given in sensation, or by the retina.
Rather, it is the actual mode of picturing which we engage in, in making 
pictorial representations, which performs this mediating function, and to 
which (different) constancy theories are therefore relevant (depending on 
different rules of transformation in representation). Construed in this
way, constancy theory is not a theory of visual perception per se, but really 
(and unwittingly) a theory of picture-perception! It follows, therefore, 
that a given theory of constancy defines a criterion of fidelity, as its 
concomitant, namely just that rule of perspective representation which 
presents the group of transformations characterized by the invariance 
which the (particular) constancy theory proposes. 

The group of transformations which defines linear perspective as the 
norm of fidelity in representation is just that of Euclidean geometry (as a 
formal mathematical system) and its interpretation for light rays, reflec- 
tion, and refraction, in Euclidean geometrical optics. My argument is 
therefore that it is the theory of vision which interprets the visual system 
in terms of Euclidean geometrical optics which serves as the theoretical 
warrant for the fidelity of perspective representation. This theory there- 
fore not only organizes the empirical evidence in a given way, but also sets 
the framework for what will be taken as evidence, and what will count as 
experimental data. It therefore organizes inquiry on the model of the 
theory. For example, the theory defines light-rays as straight lines, and 
defines image formation through projective transformations in terms of 
Euclidean point-for-point mappings of the intersections of such pro- 
jected rays with plane surfaces. If the experimental evidence shows that 
such a mapping does not take place in the physiological projection on the 
retina, for example, then the theory is false. But in fact, the retina is not an 
undifferentiated plane surface. It is a differentiated curved surface 
(differentiated by the distribution of various arrays of receptor-neurons, 
e.g. for slant, edge, color, etc.).³ The theory is therefore false on these
grounds alone. But it is also phenomenologically false, if it is taken not 
simply as a theory of the physical (or physiological) geometry of vision, 
but as an explanation of what we see. For pictorial representation in 
accordance with strict linear perspective looks wrong.⁴ Now this is a
tricky point. For if I claim that a picture ‘looks wrong’ when it is made in
accordance with an unmodified rule of Euclidean projection, then it 
would seem I am appealing to some pre-pictorial criterion of visual 
fidelity — e.g. perception per se — and this would be in contradiction to my 
hypothesis that our vision is mediated by representation. Here, instead, it 
would seem that unmediated vision becomes the test of fidelity. The 
answer to this objection lies, I think, in the fact that our vision is never 
simply the product of a given norm of representation, but is a complex 
process mediated by a group of norms, some deriving more directly from 
our biological and practical activity (e.g. from the physiological basis of 
our visual system, or from the forms or structures of our motor-activity, 
or our non-pictorial praxis) and some from the different norms of 
representation which have developed historically, and which form, so to 
speak, our visual heritage, or parentage. The fact that painters have, from 
the start, modified the rules of strictly geometrical linear perspective 
bespeaks a certain autonomy in the choice of norms, and in the modification
and elaboration of norms in our representational practice itself. But
this is the subject for a different inquiry. 

My argument against the standard view is presented in the context of 
Nelson Goodman’s discussion of perspective in Languages of Art. Good- 
man has presented the most striking and challenging argument against 
the standard view. 

In the brief section on perspective in Languages of Art (pp. 10-19), 
Nelson Goodman argues that “Pictures in perspective, like any others,
have to be read; and the ability to read has to be acquired.” (p. 14)
Against the standard argument that perspective representation dupli- 
cates a bundle of light rays reflected from the object itself, under specified 
conditions, Goodman argues that the “specified conditions” (e.g. fixated
monocular vision through a peephole, frontal-parallel presentation, at a
certain distance, etc.) are so “grossly abnormal” that “to measure fidelity
in terms of rays directed at a closed eye would be no more absurd.” (p. 13)
In short, Goodman’s argument is that the claim that two-dimensional 
perspective representation delivers the same visual information as that of
the three-dimensional objects represented, is based on such a contrived 
mode of presentation, and such an impoverished and abstracted mode of 
vision, that the ‘likeness’ or ‘fidelity’ of image to object is itself a construct 
no less subject to rules of representation than any other non-perspectival
representation. It is just that the rules are different. 

Now the reason such an argument as Goodman offers seems perverse is
that our own familiar canon of fidelity in representation is that of linear 
perspective. I have argued (in ‘Pictures, Representation and the
Understanding’⁵) that the adoption of this canon is an historical act,
which involves the adoption and interpretation of Euclidean geometrical 
optics as a theory of vision; and that this theory is false. Further, that our
seeing is itself not simply a physiological, but a social and cultural activity, 
and that our adopted modes of representation guide our seeing itself. 
“We see by way of our picturing,” I said there; and meant by it that we
come to ‘see’ tilted circles as ellipses only because we have come to 
represent them as ellipses in our adoption of perspective as a canon of 
representation. In fact, we continue to see tilted circles as tilted circles 
except when we are asked to represent them; and then we represent them 
as ellipses in order to make them look like tilted circles suitably drawn in 
accordance with the rules of perspective (or the theory of geometrical 
optics as a theory of projection of reflected light rays upon the retina). 

Here, I should like to present an even sharper and more perverse-sounding
argument, which follows, I believe, from Goodman’s account;
but which offers a radical reinterpretation of the phenomena of visual 
constancy, in particular, shape and size constancy.* It is this: that the
very discovery or description of the phenomena of constancy cannot 
make sense except as an interpretation according to the rules of perspec- 
tive (or the theory of vision based on geometrical optics). And that, in 
fact, the very allegation that we ‘preserve constancy’ through projective
transformation is not an account of the way the eye or our visual system 
works, but is rather dependent on the adoption of a false theory of vision,
linked to a convention of representation (which is itself neither true nor
false).

* I deal in this paper only with shape and size constancy. Whether my argument can be
interpreted for object or color constancies, or for others, I do not know as yet. But it would
be a different argument, since what is at issue here is the perspective transformation of size
and shape, and as yet, nothing more.

The standard view, for example, holds that parallel lines ‘visually’ 
converge, when presented in anything but the frontal-parallel plane (and 
even there, as their distance above or below the line of sight increases). 
Thus, railroad tracks appear to converge in the distance. By the adjust- 
ment we make (e.g. by ‘unconscious inference,’ in Helmholtz’s classical 
phrase), we ‘correct’ for the apparent visual convergence, and interpret 
the visual (i.e. retinal) image of converging lines as parallel, but as going 
off into the distance. 
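
The convergence that the standard view attributes to the projected image can be sketched under the same idealization. This is a minimal illustration only, assuming central projection of points onto a picture plane in front of a single fixed viewpoint (exactly the monocular model at issue). The function names, the 1.435 m gauge, and the 1.6 m eye height are hypothetical values chosen for the example.

```python
def project(point, plane_distance=1.0):
    """Central projection of a 3-D point onto a picture plane.

    The viewpoint is at the origin looking along +z; the picture plane
    lies at z = plane_distance. Returns picture-plane (x, y).
    """
    x, y, z = point
    if z <= 0:
        raise ValueError("point must lie in front of the viewpoint")
    return (plane_distance * x / z, plane_distance * y / z)


def rail_separation_in_picture(gauge, eye_height, depth, plane_distance=1.0):
    """Horizontal gap between two parallel rails as drawn in the picture.

    The rails run along z on the ground, gauge metres apart, with the
    eye eye_height metres above the ground (assumed values).
    """
    left = project((-gauge / 2.0, -eye_height, depth), plane_distance)
    right = project((gauge / 2.0, -eye_height, depth), plane_distance)
    return right[0] - left[0]


if __name__ == "__main__":
    for depth in (5.0, 10.0, 100.0, 1000.0):
        gap = rail_separation_in_picture(1.435, 1.6, depth)
        print("depth", depth, "projected gap", gap)
```

In this projection the drawn gap between the rails shrinks in proportion to 1/depth, so the two lines approach a single vanishing point in the picture although the rails themselves stay a constant distance apart.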

My argument is that we do not make any ‘correction’ or ‘inference’ at 
all; but that parallel lines going off into the distance appear, in normal 
binocular vision, to be just what they are — parallel lines going off into the 
distance, without convergence. Constancy does not need to be preserved. 
It is given. Yet, all of us can see parallel lines as converging in the 
distance. That is to say, we can willfully violate the given constancy. And I 
claim that we can do so only because we have adopted a perspective mode 
of representing parallel lines as converging, in our pictorial, or two- 
dimensional representation of such lines going off into the distance. But 
why should we adopt such a mode of representation? The standard 
argument says that this reproduces or matches the visual information 
delivered to the eye, i.e. it reproduces the retinal projection of the 
reflected bundle of light rays from the objects themselves; and that this in 
fact can be measured. Goodman’s argument, I take it, counts against that, 
and I will assume it here, for the moment, in order to go further. But I will 
return to it later, in the light of recent psychological studies of picture 
perception, and in particular J. J. Gibson’s alternative account. Why do 
we, then, come to adopt the perspective representation of converging 
lines as canonical? The answer requires that we play some tricks with 
mirrors. 

Mirrors are strange, because deceptive. If we strip ourselves of the 
historical familiarity with mirror images for the moment (i.e. ‘bracket’
our easy accommodation of the mirror image), then we may recapture the
magic of the mirror: it delivers a three-dimensional image from a 
two-dimensional surface. We discover the mistake of identifying a mirror 
image with the object imaged when we bump into the mirror. Kept from 
bumping, and from surround-cues (frame, blemishes or distortions or 
glossiness or glaze-effects which reveal the mirror surface, etc.), we may
be fully deceived. In fact, on Goodman’s analysis of metaphor, by some 
small stretching, mirror images may be seen as metaphors, in that any 
image taken for the object imaged is a label, i.e. a non-verbal visual 
metaphor, taken for something it is not. The mirror image is a peculiar 
‘metaphor,’ however, in that it is hardly a “calculated category mistake”
as Goodman says of verbal metaphors (p. 73). 

In any case, it is not true that mirrors do not lie. They assuredly do, 
when we do not recognize the images in them (or ‘on’ them) as mirror 
images; that is, when we mistake what appears on (or from) a two- 
dimensional surface as a three-dimensional object. Recognizing the 
mirror as a mirror and the image as an image requires some small 
sophistication. 

Now add to this small sophistication the rather grand move of inter- 
preting the eye as a mirror of objects, and of doing so not only naively, but 
theoretically. That is, consider that the mathematical theory of Euclidean 
projective geometry, suitably interpreted as the classical theory of optics 
(with rectilinear propagation of rays, reflection and refraction of light rays 
from plane or curved surfaces, and convergence of such rays through 
pinholes or lenses) is taken as the theoretical account of image-formation. 
The result is that the two-dimensional mirror image will be taken to be 
the same sort of image as that projected on the ‘mirror’ of the eye. And if 
the mirror image is itself to be represented, that representation will be 
formed by duplicating the surface image on the mirror. Literally, the 
mirror image can be transferred to a pictorial representation either by 
tracing it directly (e.g. with a crayon on the mirror’s surface) or by 
projecting such an image through a lens or a pinhole onto a plane surface, 
and tracing it there; or, as Leonardo Da Vinci proposed, tracing it on a 
transparent pane of glass held before the eye. This is, in effect, what the 
camera obscura provided for the Renaissance. Thus, Brunelleschi’s 
injunction to “draw it the way it looks in the mirror” is the craftsman’s
practical version of incorporating the laws of geometrical optics into the 
rules of pictorial representation. 

Plainly, when the parallel lines going off into the distance are mirrored 
or projected on a plane surface, they do converge; and so they are 
represented, in perspective drawing. But my argument is that they do not 
appear to converge in three-dimensional binocular vision, except when
we have learned to transfer the convergence adopted in perspective
representation to our actual abstractive vision of the three-dimensional 
world. Parallel lines appear to converge only after we have learned to see 
by way of our picturing, and do so only when we are able to translate our 
vision into the language of our representation. That is to say, the 
historical adoption of a rule of representation affects our perception 
itself, to the degree that we can now ‘see’ the convergence when we put 
ourselves into the framework of the new rule. The broader argument, 
which I develop elsewhere, is that our modes of perception change 
historically, in accordance with changes in the modes of our social or 
cultural practice — in this case, the practice of pictorial representation. 
This is an aspect of a general view which I call historical epistemology.⁶

What does this do to the usual account of perceptual constancy in this 
case? I would argue that perceptual constancy can be introduced as a 
concept, and indeed, becomes an actual perceptual phenomenon, only 
when what has to be ‘corrected’ for is our pictorial mode of perspective 
representation formed in accordance with a theory of vision based on 
Euclidean geometrical optics. 

Now this is a fairly strong claim. In its strongest version, it alleges that 
there is no visual processing of variable inputs which preserves constancy,
and that, in our normal vision, none is needed. This would be
tantamount to saying that naive vision is correct, and thus needs no 
correction by a constancy transformation. But of course, there is no such 
thing as naive vision. The very evolution of the eye as an adapted visual 
mechanism embodies those processes and functions which vision has to 
serve. Since, naively speaking, we want to account for the visual ability to 
recognize the same object in variable presentation, (i.e. through varia- 
tions in angle of sight, distance, light conditions, surround, etc.), we have 
to build into our theory of vision an account of how such identification is 
possible. But the demand can be made only if this identity is put in 
question by variation. And to the ‘naive’ (i.e. highly complex, adapted, 
evolved) eye, it is not put in question. For the ‘naive’ eye is evolved 
precisely to preserve this identity. At most, it would seem, object and 
shape constancy, if they are not innate, are acquired very early, among 
human infants,⁷ and seem to be innate for criterial and functional objects
and shapes among animals.⁸ What puts constancy in question, then, is our
own sophistication, and our science. We are able to abstract from our 
visual activity; but only because we have acquired an ability to represent
three-dimensional objects and scenes visually on a two-dimensional 
surface; and because we have learned to construct theories of vision 
based on this activity. It is the abstractive practice of representation which 
introduces the possibility of visual abstraction in ordinary perception of 
objects. We can ‘see’ the variations in shape and size which objects 
present to us only because ‘shape’ and ‘size’ have become separable, i.e. 
abstractible visual concepts, by means of our practice of representation. 
Apparent convergence of parallel lines in the distance, diminution of size 
with distance, and other phenomenal variations which require the 
mechanism of constancy to explain the veridicality of our vision, are all 
phenomena we would not experience, were it not for our ability to 
represent things ‘the way they look in the mirror.’ The very notion of 
‘phenomenon’ or of ‘the appearance of things,’ by contrast to the way 
they ‘really are,’ is a cognitive and perceptual act of abstraction, not built 
into the perceptual apparatus, but achieved by reflection and inquiry. 

Now this is not to argue that we should not represent things ‘the way 
they look in the mirror.’ It is rather to acknowledge that we do, but that
we do not always, and that our choice to represent them in this way is an 
historical choice, conditioned by particular norms adopted for particular 
purposes. This does not yet settle the question as to whether this choice
(i.e. the choice of the rules of perspective representation) is a choice of the
‘correct’ rendering of our visual world. It does not settle the question 
because my argument puts in doubt the very way in which the question is
raised. 

The very notion of ‘correct rendering’ presupposes not simply that 
there is a norm, but that among alternative norms, one ought to be 
adopted because it yields the ‘correct rendering’. But this further presup- 
poses that the choice of one among alternative norms of representation — 
e.g. of linear perspective over its alternatives — itself is determined by 
some norm: a metanorm, or norm of norms. As in the question of choice
among alternative theories in science, this question also asks for a 
criterion of choice (or of acceptance or rejection). To say, as I do, that the 
choice is ‘historical’ leaves it apparently norm-less, i.e. seems to concede 
to an historical relativism with respect to ‘correct rendering’ or ‘fidelity’; 
to a conventionalism, with respect to the nature of norms; and, at best, to 
a pragmatic criterion with respect to which norms suit our purposes. The 
alternative would seem to be the adoption of an essentialist view of the
‘norm of norms,’ namely, one which takes truth as such a ‘norm of norms’ 
and therefore defines fidelity in representation in terms of some notion of 
‘the way things really look.’ Between the Scylla of historical relativism 
and mere descriptivism (in which each norm is its own warrant of
‘correctness’, a variant on Pirandello’s “Right you are if you think you
are”) and the Charybdis of essentialism, there is a narrow strait which I
hope to navigate. I confess that the difficulty is as great here, in the 
question of choice among norms of representation, as it is in the context 
of norms for choice among scientific theories, though it is a different 
question. 

I will return to this question of the criteria for fidelity in representation 
shortly. But first, let me briefly summarize my argument thus far and raise 
some questions against it, in order to go further: The visual system is
evolved to perceive constancies of shape and size, i.e. to see objects and 
scenes from different viewpoints and at different distances without varia- 
tions in size and shape. The ability to attribute variation, as a result of 
changes in viewpoint or viewing distance, is an achieved ability, which 
results from the cultural and historical adoption of a particular mode of 
representation, e.g. that of linear perspective. The theory of perceptual 
constancy alleges that this ability is part of the physiology and the
psychology of ‘normal’ human and animal vision. I am arguing by 
contrast, that this ability is (historically) learned, by means of an achieved 
cultural practice of representation. 

In effect, then, what I am arguing for is the inverse of the traditional 
view. The traditional view — the theory of perceptual constancy — alleges 
that the visual system receives variables and perceives constants; that is, it 
constructs (by ‘unconscious inference,’ or by some mental processing) a 
veridical picture or map of the external world, which is then imposed on 
the variations in the information which the flux of reflected light presents. 
I argue, by contrast, that the visual system is already structured to 
perceive constants, and that the additional ability to perceive variations is 
an achieved one; that is, that we learn to make inferences to the variations
in shape and size, and not from them; and that this ability derives from the
theoretical analyses of vision, which are embodied in our canons of 
representation. It is therefore because we make pictures according to the 
rules of perspective, that we learn to ‘see’ the size and shape variations of 
objects in the visual field. To put this another way: The visual field itself
(the space of our visual activity and of the human practice which involves
vision) is a construct which is ordered by our practice, in particular, by
our practice of making pictorial representations of the visual world. 

Therefore, I argue, the theory of perceptual constancy is false about 
vision per se. I should qualify this, now, and claim that the theory of 
perceptual constancy holds only for that particular mode of visual activity 
which is derived from, and dependent upon picture-perception, i.e. that 
mode which already interprets the visual world as a picture, or sees the 
three-dimensional visual world through the ‘lenses’ of a two-dimensional 
pictorial representation; in the case at issue, specifically as a two- 
dimensional pictorial representation made according to the rules of linear 
perspective. 

Now one may raise against this view at least two serious objections: 
first, how could one tell that so-called ‘naive’ vision picks up constancies 
without a transformation of variable inputs? What empirical basis could 
one adduce for the claim that the ability to notice variation, and to 
‘correct’ for it, is learned? Second, what could it mean to claim, as I do, 
that the acquired constancy transformation is the result of the adoption of 
a rule of representation which then guides our perception, i.e. of a norm, 
and not simply the (unconscious) operation of the visual system according 
to a biologically or neurophysiologically based law, or on the basis of the 
physical theory of optics? 

(1) To the first question, there is a systematic answer, and a developed 
countertheory in the work of the psychologist, J. J. Gibson.⁹ Briefly,
Gibson argues that the organism is evolved to pick up (visual) invariances 
from what he calls the ‘ambient light,’ and that these higher-order 
invariances are given, so to speak, in the stimulus-information itself. 
Thus, it is not variation which is given, and then transformed, but rather 
invariance. In Gibson’s view, the visual system is to be understood 
ecologically, that is to say, as the system developed for an organism which 
is active, moves about in the world, and therefore sees the world in terms 
of those invariances which are required for its activity. Hence, he argues, 
the appropriate optics for such an ecological account of vision is not 
physical optics, but what he calls ecological optics. 

Gibson’s theory is not in itself the empirical evidence for the view that 
the ability to notice shape and size variation with changing view-points 
and viewing distances is an achieved ability. But it does offer an alternative 
theoretical account which permits us to interpret the evidence in a 
different way. That is, it breaks us loose from the traditional interpreta- 
tion of the evidence which constancy theory offers, concerning the 
‘givenness’ of size and shape variation in the stimulus-information, or in 
the stimulus image. Methodologically, I want to claim therefore that it is 
not fundamentally a question of what the empirical evidence is for one 
view or the other, but rather a question of what will be taken to count as 
empirical evidence, and how this evidence will be interpreted. To put this 
matter briefly, without going into the experimental data itself: there is no 
reason to think that the visual stimulus, or ‘image’ is originally variable, 
except on the basis of a theory of visual image formation, or on the basis of 
a practice of representing and noting this variation. Variability in the 
appearances is something we introduce into the visual world, upon 
reflection. (My extended argument on this, given elsewhere, is that 
reflection arises with, and is concomitant with, representation.)¹⁰ More 
generally, my argument is that the distinction between ‘appearance’ and 
‘reality’ is a distinction which requires theory, and that without theory, 
the visual world offers no difference between appearance and reality: 
what appears is just what it appears as. Or to put it differently and 
pretheoretically: what is, appears as what it is. That is not to say that we 
are pre-theoretically limited only to phenomena, or appearances, and get 
‘behind them’ by means of theory; but rather, that the very distinction 
makes no sense, and does not exist, pre-theoretically. (Naive realism, for 
example, is not a mistake. It is simply naive, i.e. pre-theoretical and 
unreflective.) 

Here, the experimental evidence is fairly clear, in fact. The more 
‘naive’ the subject, the less influential is the effect of the norms of 
representation on what the subject reports as what he/she ‘sees.’ Now 
this may seem to be a vacuous claim, or a circular one, if indeed naiveté 
is defined with reference to knowledge or ignorance of the rules of 
perspective representation. But I am claiming no more than this. If, in 
fact, experimental subjects who have little or no familiarity with, or 
practice in the canons of perspective representation, report or exhibit 
that the shapes they are presented with are (more or less) invariant in 
differing presentations (and more or less so in proportion to their 
naiveté), then it would seem to me to follow that it is the educated vision 
(e.g. of teachers of perspective drawing, in some of the experiments) 
which is prone to note the variation for which constancy is the 
‘correction.’¹¹ 

Gibson’s ‘ecological optics’ is a theory of natural vision, and not yet a 
theory of the cultural variation which I claim pictorial representation 
introduces into visual perception. Thus, though my view is compatible 
with his, at the level of animal or naive vision, it diverges from his in the 
claim that our vision, as human vision, goes beyond the framework of 
ecology, as an account of the evolutionary conditions of the development 
and functioning of the visual system, to an account of the post-biological 
or post-evolutionary — i.e. cultural and historical — conditions of the 
development and functioning of the human visual system. What I am 
offering is therefore not the experimental evidence on which my pro- 
posed theory is based, but rather a theoretical proposal which will lead to 
experimental or empirical research. 

Not that there is not a large and growing body of data, which is already 
relevant to this view. But rather, that the interpretations of this data are 
under-determined, and the type of research which is presently going on 
has not yet posed sharply enough the relation between prevailing modes 
of pictorial representation and prevailing modes of perception. The areas 
of research are in cross-cultural studies of picture-perception;¹² and in 
experimental studies of perceptual interpretation of objects and scenes in 
the three-dimensional world, on the one hand, and of two-dimensional 
representations or pictures of these objects and scenes on the other.¹³ In 
this article, I can only point to the growing body of such studies, and note 
how vigorous and lively the present discussion and research activity is. 

(2) This brings us to the second question: What does it mean to claim 
that perception is rule-following rather than law-like in its operation? 
Most radically, this claim suggests that, since rules are created by and 
changed by historical and cultural practice, therefore, perception itself 
can change in its mode, with such variation. In short, human perception is 
historically conditioned, and not simply biologically formed. How would 
one show this? It seems to me that the whole weight of the argument, in 
the philosophy and history of science, for the view that experimental 
observation is to one or another degree theory-conditioned; that what we 
see is bound to frameworks of interpretation which predispose us to what 
there is to be seen — in short, that the general view of the comparative and 
historical social psychology and sociology of perceptual belief and perceptual 
activity, argues for the conclusion that perception is rule-governed. 
One may argue, against this, that this holds perhaps for the 
higher reaches of experimental observation in the sciences, where scien- 
tific theory is involved in defining the entities to be observed, and in 
defining the means of experimental observation and measurement themselves; 
but that this is not so (or not clearly or necessarily so) about 
‘ordinary’ (that is, pre-theoretical, practical, common-sense) perception. 
Yet there is enough evidence at the allegedly ‘lower’ reaches of 
perception (e.g. experiments in word-recognition, picture or object 
identification, physiognomic recognition, etc.) to show that variations in 
commonplace belief, expectation, ‘set’ (Einstellung), emotional states, and 
perceptual hypotheses condition and change our perception. Still, one 
may argue against this view, that such variation holds at higher or more 
complex levels of perception, which involve, e.g. social belief, but not at 
the rock-bottom perception of such fundamental physical and ecological 
variables as size and shape. My suggestion here is that just such funda- 
mental perceptual variables are also rule-governed. The rules are not as 
such rules for perception per se, but rather rules for pictorial representa- 
tion of what is perceived. Thus, the experimental test of such a suggestion 
as I put forth here depends on contrasting perceptual contexts governed 
by different rules. This is the point of cross-cultural studies of picture- 
perception, as well as of differential tests of subjects who are requested to 
represent pictorially what they see (or who are requested to choose 
among alternative pictorial representations). On my view, then, alterna- 
tive rules or canons of the perspective representation of shape or size, 
such as are exemplified, e.g. in pre-Renaissance Western painting, or in 
Chinese or Persian painting or in so-called ‘primitive’ art, are not simply 
ways of picturing, but also ways of seeing. What I am suggesting is that 
there is a significant connection between pictorial styles and what one 
may call ‘visual styles,’ (with respect to the variables of size and shape at 
least); and that just as representation has a (cultural) history, so too does 
vision, by virtue of its involvement with the activity of pictorial represen- 
tation. 

If this is the case, then we are confronted with (at least) two alterna- 
tives: either the history of visual styles, like the history of representational 
styles, has no intrinsic norm which determines which mode of perception 
is more ‘correct’ or ‘truer’ than another (different ways of seeing are 
simply historical facts about human perception, and one is not ‘truer’ than 
another, or ‘more correct’); or visual styles, and concomitantly represen- 
tational styles as well, are to be judged by some standard of adequacy, 
such as veridicality or the fidelity of perception (and thus too, the fidelity 
of pictorial representations of what we perceive). Let me turn, finally, 
then, to a discussion of one aspect of this question: whether, and in what 
sense, pictorial representation in accordance with the rules of linear 
perspective, is a more correct rendering of what we see than those modes 
of pictorial representation which do not follow this rule. 

What is at issue in the question whether perspective representation is 
‘correct,’ or more correct, or less mistaken than non-perspective rep- 
resentation, is whether there is some test for correctness. It is sometimes 
argued that the empirical test of the correctness of a representation, or of 
its fidelity, is recognizability. If recognizability of perspectival representa- 
tion is decisively greater than that of alternatives, then the canons of 
perspective representation are awarded the palm for fidelity. What we 
need, then, is a good experimental design, i.e. one which is not vacuous. 
For if our experimental subjects are already predisposed to measure 
pictorial or representational fidelity by the canons or rules of perspective 
representation, then of course, our experiment begs the question, and 
decides nothing. Experimental psychologists, therefore, have chosen 
their experimental subjects in such a way as to avoid this. As I summarized 
some of the research, in my earlier paper, subjects of various 
sorts, from visually naive to sophisticated, from apes, infants and idiots to 
teachers of perspective drawing, were chosen.¹⁴ Goodman notes some 
older cross-cultural studies and there have been others.¹⁵ 

Let me not insist that the outcome bears out Goodman’s view, or my 
own, though I think it does. What matters is that the view that recogniza- 
bility tests fidelity of representation, is simple-minded and mistaken. For 
suppose perspective representations did turn out to be the most univer- 
sally recognizable, i.e. most successfully identifiable as representations of 
their objects, by some statistically significant measure; and that therefore, 
perspective representations were judged to be more ‘correct’ on these 
grounds than some alternative. What would remain at issue is why this is 
so. The standard view proceeds from the premise that the mirror theory 
of vision, and the concomitant theory of perceptual constancy give us a 
true account of visual reception and perception, where the theory of 
vision tells us what is received by the eye, and constancy theory tells us 
how what is thus received is correctly perceived, by means of a transfor- 
mation. Therefore, a representation which ostensibly duplicates what the 
eye receives is ‘true’ in the sense that it yields an image which corresponds 
to that produced by the objects or scenes represented, thus preserving 
fidelity in the representation. Note, however, that such an account of 
fidelity depends on a theoretical premise, namely, the mirror-theory of 
vision. And our test of fidelity was to be empirical, i.e. recognizability, or 
identification of the representation with what is represented. The claim 
that recognizability depends on fidelity in this sense (of perspective 
representation) is vacuous, however, if it can be shown that recognizability 
is equally achieved, or better achieved by other means; or if it can be 
shown that the theory which describes what the eye receives is false, and 
that therefore the account of fidelity, based on the duplication by 
representation of the received visual image, is itself false. 

But let us grant, for the moment, the assumption of the ‘truthfulness’ 
of the perspective representation, in the sense of a faithful reproduction 
of what the eye receives. If it is the case that some alternative mode of 
representation yields as great, or greater recognizability, without this 
kind of (perspectival) fidelity, then this fidelity in itself is neither the 
unique nor the guaranteed criterion of recognizability. Indeed, if what we 
mean by fidelity, functionally, is recognizability itself, and if degrees of 
fidelity are correlated with degrees of (experimentally testable) recogni- 
tion, then a quickly recognizable caricature would be a more faithful 
representation than a careful photograph which may yet be difficult to 
identify with its subject. (One may then even go so far as to claim that if 
recognizability is the only criterion of fidelity, then, strictly speaking, 
labels or titles on paintings would count as means of recognition, though 
they are in no way representations.) Recognizability is not the burden of a 
particular rule of representation, then, but may be achieved by alterna- 
tive rules, or may even be a function of completely extraneous, non- 
representational factors. But if this is so, recognizability as such is no test 
of fidelity in the usual sense of resemblance.¹⁶ 

Yet, we are hard put to rid ourselves of the nagging sense that 
perspective representations are after all, more faithful, truer to the way 
things really look, and more recognizable. My argument is not that this is 
not so, but that it is so precisely because we have adopted the rule of 
perspective representation as our norm of fidelity, and not because some 
independent criterion of fidelity (e.g. recognizability) has dictated that we 
should adopt this norm rather than some other. In short, it is the choice of 
a norm of fidelity that affects recognizability. And in human perception, 
such norms are achieved, and not merely given with our physiology. 

To conclude: I have argued that the theory of perceptual constancy is 
based on a mistake. The mistake is that the theory supposes that what we 
receive, as visual input, is variable in the way which a given theory of 
vision describes it. But this theory of vision itself takes the visual world to 
be a picture and proposes that we see it as a picture. My counter- 
argument is that we see the visual world as a picture because we picture it 
in certain ways. And therefore, what we see becomes, in significant part, a 
function of our modes of picturing. Since these modes change, historically 
and culturally, so too does our mode of visual perception itself. Seen in 
this perspective, the theory of perceptual constancy is not simply mis- 
taken, but rather mistaken in its object, i.e. in what it is a theory about. 
Taken as a theory of our visual perception of the world, it is a mistaken 
theory, as I tried to show. Taken, however, as a theory of perceiving 
invariances in pictorial representations which are themselves made in 
accordance with the rules of linear perspective, the theory not only may 
be correct; it must be correct. For the theory itself defines and prescribes 
what the variations should be for which it provides the invariances 
through transformation. 


Boston University 


(Manuscript submitted 24 February 1976; final manuscript received 8 December 1976) 


NOTES 


1 Euclid’s Optics (c. 300 B.C.) already gives the essential theorems for these transforma- 
tions. The interpretation of the eye as camera, with the projection of an inverted image on 
the retina, by means of convergence of the light rays by a lens, is formulated by Kepler, Ad 
Vitellionem Paralipomena (1604), and Descartes (Dioptrique, 1637) among others. The 
pinhole camera obscura, which forms such an inverted image, was already used by, and 
theoretically understood by Alhazen (c. 965-1039). Kepler’s work is a commentary on 
Vitellio (13th century), whose work is a commentary on Alhazen. For an excellent account 
of the elements and a brief history of the theory of vision, see M. H. Pirenne, Optics, Painting 
and Photography, Cambridge: Cambridge University Press, 1970, chapters 1-6. 
2 The fundamental error of conceiving of the retinal image as something we ‘see,’ or which 
somehow presents itself to our awareness as an internal ‘picture’ comes from viewing the eye 
as a camera. As M. H. Pirenne points out, “we do not see our retinal image .... As LeGrand 
has said epigrammatically, the eye is the only optical instrument which forms an image 
which has never been intended to be seen: This is the great difference between the eye and 
the photographic camera. Failure to realize this lies at the root of many misunderstandings.” 
(in Optics, Painting, and Photography, Cambridge: Cambridge University Press, 1970, p. 9). 
Pirenne points out that Descartes already understood that the retinal image was not itself a 
‘picture’ for us: “Descartes in his Dioptrique had already postulated that the retinal 
excitation pattern was conveyed to the brain, so that a picture (‘une peinture’) was formed 
there, bearing a certain resemblance to that formed on the retina and therefore to the 
external objects. But Descartes himself insisted that it was not by virtue of the resemblance 
of this ‘picture’ with the objects that we see them ‘as if there were again other eyes within our 
brain with which we could see it’ ....” (loc. cit.) Newton, however, in his discussion of Axiom 
VII, in the Opticks, seems to propose just such a view of visual perception as a kind of ‘seeing’ 
of retinal images, though his language is ambiguous, and he speaks of the retinal images only 
as the ‘causes’ of our vision. Thus, he writes, “In like manner, when a Man views any object 
PQR, the Light which comes from the several Points of the Object is so refracted by the 
transparent skins and humours of the Eye, (that is, by the outward coat EFG, called the 
Tunica Cornea, and by the crystalline humour AB which is beyond the Pupil mk) as to 
converge and meet again in so many Points in the bottom of the Eye, and there to paint the 
Picture of the object upon that skin (called the Tunica Retina) with which the bottom of the 
Eye is covered. For Anatomists, when they have taken off from the bottom of the Eye that 
outward and most thick Coat called the Dura Mater, can then see through the thinner coats, 
the Pictures of Objects lively painted thereon. And these Pictures, propagated by Motion 
along the Fibres of the Optick Nerves into the Brain, are the cause of Vision. For 
accordingly as these Pictures are perfect or imperfect, the Object is seen perfectly or 
imperfectly.” (Opticks, New York: Dover, 1952, p. 15.) 

3 See, for example, David H. Hubel, ‘The Visual Cortex of the Brain.’ Scientific American, 
November 1963; Stephen W. Kuffler, ‘Discharge Patterns and Functional Organization of 
Mammalian Retina’, Journal of Neurophysiology, January, 1953; David H. Hubel, ‘Integrative 
Processes in Central Visual Pathways of the Cat’, Journal of the Optical Society of 
America, January, 1963; D. H. Hubel and T. N. Wiesel, ‘Receptive Fields, Binocular 
Interaction and Functional Architecture in the Cat’s Visual Cortex’, Journal of Physiology, 
January, 1962; Ragnar Granit, ‘The Visual Pathway’, The Eye, Volume II: The Visual 
Process, Academic Press, 1962; J. Y. Lettvin, H. R. Maturana, W. H. Pitts, and W. S. 
McCulloch, ‘Two Remarks on the Visual System of the Frog’, Sensory Communication, The 
M.I.T. Press, 1961; S. W. Ranson (revised by S. L. Clark), The Anatomy of the Nervous 
System, Tenth Edition, W. B. Saunders Company, 1959, pp. 264-266. 

4 See, for example, Pirenne’s discussion of the ‘corrections’ of the ‘distortions’ of linear 
perspective by artists, op. cit., pp. 121-135. 

5 M. Wartofsky, ‘Pictures, Representation and the Understanding’, in R. Rudner and I. 
Scheffler (eds.), Logic and Art: Essays in Honor of Nelson Goodman, Indianapolis and New 
York: Bobbs-Merrill, 1972, pp. 150-162. 

6 See my discussion of this in M. Wartofsky, ‘Perception, Representation and the Forms of 
Action: Towards an Historical Epistemology’, in Ajatus Vol. 36, Yearbook of the 
Philosophical Society of Finland: Aesthesis, Essays on the Philosophy of Perception, 1976, pp. 
19-43. 
7 See, for example, T. G. R. Bower, ‘Slant Perception and Shape Constancy in Infants’, 
Science, Vol. 151, pp. 832-834; ‘The Visual World of Infants’, Scientific American, Vol. 
215, No. 6, pp. 80-92; and Development in Infancy, San Francisco: W. H. Freeman and 
Company, 1974. 

8 See Robert L. Fantz, ‘Pattern Vision in Young Infants’, The Psychological Record, Vol. 8, 
pp. 43-47, and his ‘The Origin of Form Perception’, Scientific American, Vol. 204, No. 5, 
pp. 66-72. 

9 See especially the following works of J. J. Gibson: The Visual World, Boston: Houghton 
Mifflin, 1950; The Senses Considered as Perceptual Systems, Boston: Houghton Mifflin, 
1966; ‘The Information Available in Pictures’, Leonardo, 1971, 4, 27-35; and An Ecological 
Approach to Visual Perception (forthcoming). 

See also the excellent discussion of Gibson’s view in relation to these issues in Margaret 
Hagen, ‘Picture Perception: Toward a Theoretical Model’, Psychological Bulletin, Vol. 81, 
No. 8 (1974), pp. 471-497; and also the Gibsonian approach in John M. Kennedy, A 
Psychology of Picture Perception, San Francisco: Jossey-Bass Publishers, 1974. 

10 This view is developed in an as yet unpublished series of lectures on historical 
epistemology, available in draft form upon request from the author. 

11 See R. H. Thouless, ‘Phenomenal Regression to the Real Object’, Parts I-II, British 
Journal of Psychology, Vols. 21-22 (1931); ‘Individual Differences in Phenomenal Regres- 
sion’, British Journal of Psychology, Vol. 22 (1932); and his later article, ‘Perceptual 
Constancy or Perceptual Compromise’, Australian Journal of Psychology, Vol. 22 (1972). 

12 The literature here is vast. See the large bibliography in Margaret Hagen and Rebecca K. 
Jones, ‘Cultural Effects on Pictorial Perception: How Many Words is One Picture Really 
Worth’, in Walk and Pick (eds.), Perception and Experience (forthcoming). See also Jan B. 
Deregowski, ‘Pictorial Perception and Culture’, Scientific American, Vol. 227, (1972), pp. 
82-88; and John M. Kennedy, op cit., Chapter 5 (‘Picture Perception across Culture and 
Species’); and M. H. Segall, D. T. Campbell, and M. J. Herskovits, The Influence of Culture 
on Visual Perception, Indianapolis: Bobbs-Merrill, 1966. 

13 This is the subject of present experimental research by Prof. Margaret Hagen of the 
Department of Psychology at Boston University. 

14 See ‘Pictures, Representation and the Understanding’, op. cit., p. 155; and H. 
Leibowitz, I. Waskow, N. Loeffler, and F. Glaser, ‘Intelligence Level as a Variable in the 
Perception of Shape’, Quarterly Journal of Experimental Psychology, Vol. 11, (1959), pp. 
108-112. 

15 For example, M. Herskovits, Man and His Works, New York: Knopf, 1948. Goodman 
quotes this, in Languages of Art, fn. 15, p. 15. See above, fn. 12, for reference to further 
cross-cultural studies. 

16 See Robert Schwartz, ‘Representation and Resemblance’, The Philosophical Forum, 
Vol. V, No. 4 (1974), pp. 499-511, for an excellent discussion on this point. 